Reinforcement Learning for Multi-purpose Schedules

Authors

Kristof Van Moffaert

Yann-Michaël De Hauwere

Peter Vrancx

Ann Nowé

Abstract

In this paper, we present a learning technique for determining schedules for general devices that balance two objectives: user convenience and energy savings. The proposed learning algorithm is based on Fitted-Q Iteration (FQI) and analyzes the usage and the users of a particular device to decide upon an appropriate profile of start-up and shutdown times for that equipment. The algorithm is experimentally evaluated on real-life data, showing that close-to-optimal control policies can be learned within only a few iterations. Our results show that the algorithm is capable of proposing intelligent schedules that reflect how much emphasis the user places on each objective.
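
The abstract names Fitted-Q Iteration (FQI) as the basis of the algorithm but gives no further detail. As a rough illustration only, the sketch below runs a generic batch FQI loop on a toy on/off scheduling problem. Everything specific in it is an assumption and not the authors' method: the state is just a time-slot index, the reward is a hypothetical weighted sum of a convenience term and an energy term controlled by `weight`, and the regressor is a plain per-(state, action) average, whereas FQI is usually paired with a supervised regressor such as a tree ensemble.

```python
from collections import defaultdict


def scalarized_reward(present, action, weight):
    """Hypothetical reward: `weight` trades convenience against energy use."""
    # Convenience: +1 if the device is on while the user is present,
    # -1 if it is off while the user is present, 0 if nobody is there.
    if present:
        convenience = 1.0 if action == 1 else -1.0
    else:
        convenience = 0.0
    energy_cost = 1.0 if action == 1 else 0.0
    return weight * convenience - (1.0 - weight) * energy_cost


def make_batch(presence, weight):
    """Build one (s, a, r, s_next, done) sample per time slot and action."""
    n_slots = len(presence)
    batch = []
    for s in range(n_slots):
        for a in (0, 1):  # 0 = device off, 1 = device on
            r = scalarized_reward(presence[s], a, weight)
            batch.append((s, a, r, s + 1, s == n_slots - 1))
    return batch


def fitted_q_iteration(transitions, actions, n_iterations=20, gamma=0.9):
    """Batch FQI with a tabular averaging regressor (an assumed stand-in)."""
    q = defaultdict(float)  # Q_0 is zero everywhere
    for _ in range(n_iterations):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for s, a, r, s_next, done in transitions:
            # Regression target: r + gamma * max_a' Q_k(s', a')
            if done:
                target = r
            else:
                target = r + gamma * max(q[(s_next, b)] for b in actions)
            sums[(s, a)] += target
            counts[(s, a)] += 1
        # "Fit" Q_{k+1} by averaging the targets seen for each (s, a) pair.
        q = defaultdict(float, {k: sums[k] / counts[k] for k in sums})
    return q


def greedy_schedule(q, n_slots, actions):
    """Pick the highest-valued action in every time slot."""
    return [max(actions, key=lambda a: q[(s, a)]) for s in range(n_slots)]


if __name__ == "__main__":
    presence = [False, True, True, False]  # made-up occupancy pattern
    for weight in (0.8, 0.2):
        q = fitted_q_iteration(make_batch(presence, weight), actions=(0, 1))
        print(weight, greedy_schedule(q, len(presence), (0, 1)))
```

With this toy reward, a high `weight` (emphasis on convenience) switches the device on only in occupied slots, while a low `weight` (emphasis on energy savings) keeps it off throughout.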

Similar resources

Low-Area/Low-Power CMOS Op-Amps Design Based on Total Optimality Index Using Reinforcement Learning Approach

This paper presents the application of reinforcement learning in automatic analog IC design. In this work, the Multi-Objective approach by Learning Automata is evaluated for accommodating required functionalities and performance specifications considering optimal minimizing of MOSFETs area and power consumption for two famous CMOS op-amps. The results show the ability of the proposed method to ...

Stochastic reinforcement benefits skill acquisition.

Learning complex skills is driven by reinforcement, which facilitates both online within-session gains and retention of the acquired skills. Yet, in ecologically relevant situations, skills are often acquired when mapping between actions and rewarding outcomes is unknown to the learning agent, resulting in reinforcement schedules of a stochastic nature. Here we trained subjects on a visuomotor ...

Supporting Transparent Thread Assignment in Heterogeneous Multicore Processors Using Reinforcement Learning

Heterogeneity in multicore processor systems creates challenges in effectively mapping processes to diverse cores. While most approaches require programmer partitioning between core types or permutation of thread schedules to find the optimal mapping, we introduce a new machine learning approach to automated thread assignment. We train a reinforcement learning agent to assign threads to the bes...

An Evaluation of the Effects of Fixed-Time Schedules on Response Maintenance

Response-independent schedules of reinforcement (e.g., fixed-time schedules) have typically been shown to decrease the rate of responding. However, researchers have suggested that responses may maintain under response-independent schedules, although it is currently unclear as to what mechanisms are responsible for this maintenance. The purposes of the current study were to (a) replicate previou...

Operant conditioning

Operant behavior is behavior "controlled" by its consequences. In practice, operant conditioning is the study of reversible behavior maintained by reinforcement schedules. We review empirical studies and theoretical approaches to two large classes of operant behavior: interval timing and choice. We discuss cognitive versus behavioral approaches to timing, the "gap" experiment and its implicatio...

Publication date: 2013
